A Longitudinal Bilingual Frisian-Dutch Radio Broadcast Database Designed for Code-Switching Research

Authors

  • Emre Yilmaz
  • Maaike Andringa
  • Sigrid Kingma
  • Jelske Dijkstra
  • Frits Van der Kuip
  • Hans Van de Velde
  • Frederik Kampstra
  • Jouke Algra
  • Henk van den Heuvel
  • David A. van Leeuwen
Abstract

We present a new speech database containing 18.5 hours of annotated radio broadcasts in the Frisian language. Frisian is spoken mostly in the province of Fryslân and is the second official language of the Netherlands. The recordings were collected from the archives of Omrop Fryslân, the regional public broadcaster of the province. The database spans almost 50 years. Native speakers of Frisian are mostly bilingual and often code-switch in daily conversation because of the extensive influence of Dutch. Given the longitudinal and code-switching nature of the data, an appropriate annotation protocol was designed, and the data were manually annotated with orthographic transcriptions, speaker identities, dialect information, code-switching details, and background noise/music information.

Similar resources

Longitudinal Speaker Clustering and Verification Corpus with Code-Switching Frisian-Dutch Speech

In this paper, we present a new longitudinal and bilingual broadcast database designed for speaker clustering and text-independent verification research. The broadcast data is extracted from the archives of Omrop Fryslân, which is the regional broadcaster in the province of Fryslân, located in the north of the Netherlands. Two speaker verification tasks are provided in a standard enrollment-test ...

Open Source Speech and Language Resources for Frisian

In this paper, we present several open source speech and language resources for the under-resourced Frisian language. Frisian is mostly spoken in the province of Fryslân which is located in the north of the Netherlands. The native speakers of Frisian are Frisian-Dutch bilingual and often code-switch in daily conversations. The resources presented in this paper include a code-switching speech da...

Age of acquisition and naming performance in Frisian-Dutch bilingual speakers with dementia

Age of acquisition (AoA) of words is a recognised variable affecting language processing in speakers with and without language disorders. In bi- and multilingual speakers, the languages can be differentially affected by neurological illness. The study of language loss in bilingual speakers with dementia has been relatively neglected. Objective: We investigated whether AoA of words was associated...

Investigating Bilingual Deep Neural Networks for Automatic Recognition of Code-switching Frisian Speech

In this paper, a code-switching automatic speech recognition (ASR) system built for the Frisian language is described. Frisian is mostly spoken in the province Fryslân which is located in the north of the Netherlands. The native speakers of Frisian are mostly bilingual and often code-switch in daily conversations due to the extensive influence of the Dutch language. In the scope of the FAME! Pr...

Exploiting Untranscribed Broadcast Data for Improved Code-Switching Detection

We have recently presented an automatic speech recognition (ASR) system operating on Frisian-Dutch code-switched speech. This type of speech requires careful handling of unexpected language switches that may occur in a single utterance. In this paper, we extend this work by using some raw broadcast data to improve multilingually trained deep neural networks (DNN) that have been trained on 11.5 ...

Publication date: 2016